Abstract: We developed a three-step classification approach for forest
road extraction utilizing LiDAR data. The first step employed the IDW
method to interpolate LiDAR point data (first and last pulses) to
produce DSM, DTM and DNTM layers (at 1 m resolution). The RMSE of this
interpolation was 0.19 m. In the second step, a Support Vector Machine
(SVM) was employed to classify the LiDAR data into two classes, road
and non-road. In this classification, merging the distance layer with
the intensity data yielded better identification of the road position
than the intensity data alone. Assessment of the results showed 63%
correctness, 75% completeness and 52% quality of classification. In
the third step, road edges were defined in the LiDAR-extracted layers,
enabling accurate digitizing of the centerline location. More than 95%
of the LiDAR-derived road was digitized within 1.3 m of the
field-surveyed centerline. The proposed approach can provide thorough
and accurate road inventory data to support forest management.
Introduction

Forest roads are important in forest management: they provide
corridors for travel, recreation and education, and infrastructure
for fire protection and the transport of forest products
(White et al. 2010). A detailed map of every forest road is therefore
indispensable. Roads are also an important data layer in
Geographical Information Systems (GIS) (Song and Civco 2004).
Recent research on automatic road extraction is mainly motivated
by the importance of GIS and the need for data acquisition
and update procedures (Hinz and Baumgartner 2003). Because
of its low cost, surveying by GPS is currently the most suitable
method for updating information on Iran's forest roads (Abdi et al.
2012), but the method is usually time-consuming and inaccurate.
Concerns regarding the use of GPS in forests include problems
such as availability of satellite signals under a forest canopy and
satellite characteristics (Rodriguez-Perez et al. 2007).

Automated road extraction using remote sensing data can save
time and labor costs in updating a forest road database. Mapping
the land surface using an airborne Light Detection and Ranging
(LiDAR) system is the most accurate method (Gallay et al. 2012).
LiDAR data can be acquired with high frequency and accuracy
in a short time (Baltsavias 1999). LiDAR is a reliable technique
for collecting elevation data of several surface levels, depending
on the penetration of the laser beam to the ground.
These elevation data can be used to generate a Digital Elevation
Model (DEM), and the recorded intensity of the backscattered
laser beam can be used for classification of surface objects.
In vegetated areas, the first returns generally correspond to the
upper landscape canopy level (e.g., vegetation tops) and the last
returns correspond to the terrain surface. The first returns are
used to generate Digital Surface Models (DSM), while the last
returns are used for generation of Digital Terrain Models (DTM)
(Gallay et al. 2012).

DEM data are commonly in raster format; they are created using
point files and can be interpolated using many different techniques.
The techniques used to create DEMs range from simple
(e.g., nearest neighbor) to complex (e.g., kriging) gridding routines
and they can create slightly different surface types.

The most common types are surfaces created by the Triangulated
Irregular Network (TIN) or the Inverse Distance Weighted (IDW)
routines. The most appropriate interpolation method depends on the
desired use of the DEM and on the data. The IDW function should be
used when the set of points is dense enough to capture the extent of
local surface variations needed for analysis (Liu 2008). IDW
calculates cell values using a linearly weighted combination of a set
of sample points. The weight assigned to each point is a function of
the distance of the input point from the output cell location: the
greater the distance, the less influence the point has on the output
value.
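
The weighting scheme described above can be illustrated with a minimal sketch. The function name `idw_interpolate`, the tuple layout of the samples and the power parameter of 2 are assumptions made for illustration; the source does not state which power or search radius was used.

```python
import math


def idw_interpolate(points, x, y, power=2.0):
    """Estimate z at (x, y) from (px, py, pz) samples by inverse distance weighting.

    power=2.0 is an assumed default, not a value taken from the study.
    """
    if not points:
        raise ValueError("at least one sample point is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for px, py, pz in points:
        distance = math.hypot(px - x, py - y)
        if distance == 0.0:
            # The cell centre coincides with a sample: use its value directly.
            return pz
        weight = 1.0 / distance ** power  # weight falls off with distance
        weighted_sum += weight * pz
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical last-return points, 2 m apart, with elevations in metres.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_elevation = idw_interpolate(samples, 1.0, 0.0)
```

A cell lying closer to one sample is pulled toward that sample's elevation, which is the behaviour the paragraph describes.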

Detailed background on LiDAR can be found in Baltsavias
(1999) and Pfeifer and Briese (2007). The classification of LiDAR
data into objects such as road, tree and building in a forest
area has been a challenging task in remote sensing studies (White
et al. 2010; Feret and Asner 2012; Bandara et al. 2011). Several
road extraction methods (e.g., maximum likelihood, neural networks
and decision tree classifiers) have been proposed for classifying
LiDAR data. Boggess (1993) used a classification
method that incorporated texture and neural networks in classification
of roads and other features from Landsat TM imagery but
obtained numerous false inclusions. Roberts et al. (2001) developed
a spectral mixture library using hyperspectral images to
extract roads, but the use of spectral information alone does not
capture the spatial properties of the curvilinear features of such
images. White et al. (2012) used a LiDAR-derived DEM to map
characteristics of forest roads located beneath a dense forest
canopy. The position, gradient, and total length of a forest
haul-road were accurately extracted using a 1-m DEM.

Recently, the Support Vector Machine (SVM) has become a popular
approach for classifying data used to extract roads (Gomez
et al. 2010; Matkan et al. 2009).

The SVM classification technique has been increasingly applied
to classification of airborne imagery (Camps-Valls et al.
2004; Melgani et al. 2004), where its higher accuracy compared
to traditional techniques stems from its lower sensitivity to high
dimensionality (Bazi and Melgani 2006). SVM is based on the
statistical theory of learning, developed by Vapnik in 1998 (Cortes
and Vapnik 1995). This theory provides a set of principles to
be followed in order to find classifiers with good generalization,
which is defined as the ability to correctly predict the class of
new data in the same area where learning has occurred (Premebida
et al. 2009).

Song and Civco (2004) used an SVM to extract roads by classifying
LiDAR data and obtained good accuracy for rural and urban
roads. Gomez et al. (2010) compared the SVM and Mahalanobis
distance algorithms for road extraction from high-resolution
data. The SVM algorithm was superior to Mahalanobis distance,
and LiDAR data were introduced to improve the classification
process.

The purpose of this study was to determine the suitability of
LiDAR for extraction of forest roads. We classified LiDAR data
using the SVM algorithm to extract a forest road and we evaluated
the results. The paper is arranged as follows:

First, the SVM used for classification is described. Second, 
image segmentation used for shape extraction is introduced. 
Third, an experiment showing the proposed approach to extract 
roads using Ikonos images is described. Finally, we offer our 
discussion and conclusions. 

Our objectives were to assess the accuracy of extraction of a 
forest road from LiDAR terrain data and compare the results 
with conventional centerline surveying. The road position was 
evaluated by the following criteria: (1) classification: determine
the percentage of the road area that could be identified with the
LiDAR data; (2) positional accuracy: determine the 95th percentile
horizontal distance separating the LiDAR-derived and
field-surveyed centerlines.

Methods 

Study area and data sources 

The geographic setting for this study is a part of the Hyrcanian
Forests in Golestan Province, Iran. The topography of the study
area is steep and rugged, with elevations ranging from 290 to 720
m and ground surface slopes exceeding 3% and reaching 65% in
some areas. The forest road was built in 1990 and is used to access
the Shastkola forest for timber harvest, forest protection, field
research and management. The average width of the road surface
was 3.5 m. Some of the road was hidden under the forest
canopy, which covered 40%-100% of the road and averaged
80% coverage.

Airborne LiDAR data were collected by the National Geographical
Organization of Iran (NGO), using a sensor mounted on
a fixed-wing aircraft. The survey was conducted in October 2011.
The LiDAR data included the first and last returns of the
distance and intensity data, with an average density of
4 points per square meter on the surface.

Two vector layers were used to assess the newly extracted road:
one of 87 survey control points with a horizontal accuracy of
0.08 m, and a road network (line format) from previous studies
in the same study area.

Preprocessing and preparation layers 

Random errors in the original LiDAR data can be caused by
instruments such as cameras and GPS, or by non-surface features
such as birds (Huising and Gomes-Pereira 1998). Consequently, to
develop an accurate DEM, non-ground points must be removed
prior to interpolation to a raster DEM (Shan and Sampath 2005;
Zhang and Whitman 2005; Axelsson 1999). In our data, some
points in the LiDAR point cloud contained errors; these points
had unreasonably high elevation values and were removed during
preprocessing following Meng et al. (2009). One important
decision to be made prior to ground filtering is whether to use
the first or last returns of LiDAR (Hyyppa et al. 2003;
Silvan-Cardenas et al. 2006). We selected the last return
because the last pulses penetrate deeper in vegetated areas and
hence are closer to the ground surface (Kraus and
Rieger 1999; Okagawa 2001). The erroneous points were removed
from the LiDAR distance data using Eq. 1:

$$\text{First pulse} - \text{Last pulse} > T \qquad (1)$$

According to Eq. 1, the difference between the first and the last
pulse must exceed a certain threshold T; if the difference is less
than T, the point is removed.

T was calculated from Eq. 2, where σ is the error in measuring the
elevation in the LiDAR data:

$$T > \sqrt{\sigma_{\text{First pulse}}^{2} + \sigma_{\text{Last pulse}}^{2}} \qquad (2)$$
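
Read literally, Eqs. 1 and 2 amount to a threshold test on the first-minus-last elevation difference. The sketch below illustrates that literal reading only; the function names, the (first, last) tuple layout and the sample elevations are hypothetical and are not taken from the authors' processing chain.

```python
import math


def pulse_threshold(sigma_first, sigma_last):
    """Threshold T of Eq. 2: root sum of squares of the two elevation errors."""
    return math.sqrt(sigma_first ** 2 + sigma_last ** 2)


def filter_points(points, threshold):
    """Keep (first_z, last_z) pairs that satisfy Eq. 1, i.e. first - last > T.

    Pairs whose difference does not exceed the threshold are dropped.
    """
    return [(first_z, last_z) for first_z, last_z in points
            if (first_z - last_z) > threshold]


# Hypothetical elevations in metres, with an assumed 0.15 m error per pulse.
t = pulse_threshold(0.15, 0.15)
kept = filter_points([(310.0, 309.9), (325.0, 310.0)], t)
```

Both errors are combined in quadrature because the difference of two independent measurements carries the uncertainty of each.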

The removed points were then recalculated using IDW interpolation,
which estimates the value at an unknown point as a weighted
average of the values available at the known points. In the
next step, a Digital Terrain Model (DTM)
and a Digital Surface Model (DSM) were generated. In a forested
area the last pulse is reflected from the ground surface,
so the DTM was developed from the last pulse
(Reutebuch et al. 2003). The DSM was developed from the first
pulse data, which are reflected from the object level. To remove
the topographic effect for road detection, the Digital Non-Terrain
Model (DNTM) was used (Matkan et al. 2009). The DNTM layer
was created from Eq. 3:

$$\text{DNTM} = \text{DSM} - \text{DTM} \qquad (3)$$
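
On rasters, Eq. 3 is a cell-by-cell subtraction. A minimal sketch follows, assuming both layers are stored as equally sized nested lists of elevations in metres; the function name `compute_dntm` and the sample values are illustrative only.

```python
def compute_dntm(dsm, dtm):
    """Return DSM - DTM cell by cell (Eq. 3) for two equally shaped grids."""
    if len(dsm) != len(dtm) or any(len(rs) != len(rt) for rs, rt in zip(dsm, dtm)):
        raise ValueError("DSM and DTM must have the same shape")
    return [[s - t for s, t in zip(row_s, row_t)]
            for row_s, row_t in zip(dsm, dtm)]


# Hypothetical 2 x 2 grids: one cell carries a 20 m canopy, the others are bare.
dsm = [[312.0, 330.5], [311.0, 311.0]]
dtm = [[312.0, 310.5], [311.0, 310.0]]
dntm = compute_dntm(dsm, dtm)
```

Cells with a DNTM value near zero have no object above the terrain, which is what makes the layer useful for separating an open road surface from canopy.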

Support Vector Machine classification and assessments 

A Support Vector Machine (SVM) is basically a statistical linear
learning machine based on the principle of optimal separation of
classes (Watanachaturaporn 2005). SVM can identify the most
favorable linear separation of the classes, provided the data are
linearly separable. In this method, the training samples that
describe the edges of each class are used to delineate the two
classes by fitting an optimal separating hyperplane. In this study,
the LiDAR intensity data were first classified using SVM. The LiDAR
distance data were then added to the classification to remove
non-road pixels whose radiometric values were similar to those of
road surface pixels.
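
For reference, the optimal separating hyperplane mentioned above is commonly written as the optimization problem below. This is the standard textbook hard-margin form, offered as a sketch: the symbols w, b, x_i, y_i and n are notation introduced here for illustration, and the source does not state which kernel or penalty settings were used.

```latex
\min_{\mathbf{w},\, b} \ \tfrac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i \left( \mathbf{w} \cdot \mathbf{x}_i + b \right) \ge 1,
\qquad i = 1, \dots, n
```

Under this notation, x_i would be the feature vector of a training pixel (for example its intensity and DNTM values) and y_i in {-1, +1} its non-road or road label. The training samples for which the constraint holds with equality are the support vectors, i.e., the samples on the edges of the classes.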

Assessment of the extracted road data was carried out by 
comparing the automatically extracted road centerlines to the 
manually plotted road axes used as reference data. Both data sets 
are given in the vector representation. Two steps were considered
for the evaluation: (1) matching of the road primitives extracted
by the SVM algorithm to the reference network, and (2) calculation
of quality
measures (Wiedemann 2003). In the first step, the roads in both
data sets were fragmented into short pieces of equal length. Then, 
a buffer of constant predefined width (3.5 m) was constructed 
around the reference road data. The parts of the extracted data 
within the buffer were considered matched if the difference of 
the direction between the reference road data and the part to be 
matched did not exceed a given threshold. This difference was 
derived directly from the vector representations of both roads. 
Following the notation of Bazi and Melgani (2006) and Baltsavias
(1999), the matched extracted data were denoted as
true positives with length TP, emphasizing the fact that the
extraction algorithm had indeed found a road. The unmatched
extracted data were denoted as false positives with length FP.

In the second step, matching was performed the other way
around. The buffer was constructed around the extracted road
data, and the parts of the reference data lying in the buffer and
fulfilling the direction constraint were considered matched. In
cases of low redundancy their length was approximated by TP.
The unmatched reference data were denoted as false negatives
with length FN (Wiedemann et al. 1998). For comparison purposes,
the road extraction was classified as TP, FN or FP on a
pixel-by-pixel basis.

The quality measures, as defined by Wiedemann et al. (1998), are
as follows:

Completeness is the ratio of the records correctly extracted to 
the total number of relevant records within the ground-truth data 
(Eq. 4): 

$$\text{Completeness} = \frac{TP}{TP + FN}, \quad \text{Completeness} \in [0, 1] \qquad (4)$$

Correctness is the ratio of the number of relevant records extracted
to the total number of relevant and irrelevant records
retrieved (Eq. 5):

$$\text{Correctness} = \frac{TP}{TP + FP}, \quad \text{Correctness} \in [0, 1] \qquad (5)$$

Quality is a measure of the goodness of the final result. It 
takes into account the completeness of the extracted data as well 
as their correctness, as defined in Eq. 6: 

$$\text{Quality} = \frac{TP}{TP + FP + FN}, \quad \text{Quality} \in [0, 1] \qquad (6)$$
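
The three measures can be computed directly from the TP, FP and FN totals. The sketch below uses the standard Wiedemann ratios; the function name `quality_measures` is an assumption, and the example call uses the merged-data pixel counts reported in Table 1.

```python
def quality_measures(tp, fp, fn):
    """Return (completeness, correctness, quality) from TP, FP and FN totals."""
    if tp + fn == 0 or tp + fp == 0:
        raise ValueError("TP + FN and TP + FP must both be positive")
    completeness = tp / (tp + fn)   # share of the reference that was found
    correctness = tp / (tp + fp)    # share of the extraction that is correct
    quality = tp / (tp + fp + fn)   # combines both error types
    return completeness, correctness, quality


# Merged-data counts from Table 1: TP = 36145, FP = 21209, FN = 12003.
completeness, correctness, quality = quality_measures(36145, 21209, 12003)
```

Quality can never exceed either of the other two measures, because its denominator includes both the false positives and the false negatives.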

Comparison of extracted road centerline with field-surveyed 
centerline 

The position of the LiDAR-derived centerline was compared with the
field-surveyed centerline location using a simple
method described by Goodchild and Hunter (1997). We extracted
117 field-surveyed checkpoints for the exact location. This
approach compares a linear feature of high accuracy with a feature
of lower accuracy, and determines the percentage of the
low-accuracy line that falls within a specified horizontal distance
normal to the high-accuracy line. The method was used to answer the
following question: what percentage of the digitized forest road
fell within x meters normal to the surveyed road centerline? The
overall positional accuracy was measured as the buffer width
containing 95% of the test line.
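
The buffer comparison can be approximated with points sampled along the digitized line. This is a simplified sketch, not the authors' GIS workflow: the original method measures the proportion of line length inside the buffer, whereas this version counts sampled vertices, and all function names and coordinates are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_polyline(p, polyline):
    """Shortest distance from p to any segment of the reference polyline."""
    return min(point_segment_distance(p, a, b)
               for a, b in zip(polyline, polyline[1:]))


def share_within(test_points, reference, buffer_width):
    """Fraction of sampled test points lying within buffer_width of the reference."""
    inside = sum(1 for p in test_points
                 if distance_to_polyline(p, reference) <= buffer_width)
    return inside / len(test_points)


# Hypothetical surveyed centerline and four vertices of a digitized line.
surveyed = [(0.0, 0.0), (10.0, 0.0)]
digitized = [(1.0, 0.5), (5.0, 1.0), (9.0, 2.0), (3.0, -1.2)]
share = share_within(digitized, surveyed, 1.3)
```

Increasing the buffer width until the share reaches 0.95 gives the positional accuracy figure in the sense used here; denser sampling along the digitized line brings the point-based share closer to a length-based one.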

Results and Discussion 

In this study, the precision of the LiDAR height data was 15 cm,
so from Eq. 2:

$$T > \sqrt{15^{2} + 15^{2}} \approx 21 \text{ cm} \qquad (7)$$

where T was chosen as the threshold. Applying this threshold
removed 585 false points from the LiDAR point cloud.
The removed points, which made up less than 10% of the total
points, were recalculated by the IDW interpolation method.

The interpolation accuracy of the IDW interpolator at 1-m cell
resolution was assessed using the RMSE statistic:

$$\text{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left( Z_{i,\text{estimated}} - Z_{i,\text{truth}} \right)^{2}}{N}} = 0.19 \text{ m} \qquad (8)$$
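
The RMSE statistic itself is straightforward to compute. The sketch below assumes two equally long sequences of interpolated and reference elevations; the function name `rmse` and the sample values are illustrative and are not data from the study.

```python
import math


def rmse(estimated, truth):
    """Root mean square error between interpolated and reference elevations."""
    if len(estimated) != len(truth) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(e - t) ** 2 for e, t in zip(estimated, truth)]
    return math.sqrt(sum(squared_errors) / len(truth))


# Hypothetical interpolated vs. surveyed elevations at two check points (m).
error = rmse([10.2, 9.8], [10.0, 10.0])
```

Because the errors are squared before averaging, a few large deviations raise the RMSE more than many small ones.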



The size of a grid cell is commonly referred to as the grid
cell's resolution, with a smaller grid cell indicating a higher
resolution. A grid cell resolution must be selected as part of the
interpolation process; each interpolation technique is implemented
using a user-selected grid cell resolution (Wechsler
2007). The DTM layer generated from the last pulse after IDW
interpolation, the DSM and the DNTM are displayed in Fig. 1.

Fig. 1: (a) DTM; (b) DSM; (c) DNTM; (d) Intensity data

Evaluation of the results of the interpolation method showed
that IDW was a suitable method for interpolating LiDAR data, as
reported by Ali (2004), Blaschke (2004) and Podobnikar (2005),
i.e., the IDW method performs well if the density of sampling
data is high. LiDAR data have high sampling density, so IDW is
a suitable interpolator for a DEM generated from LiDAR data
(Liu et al. 2007). Given the point density (four points
per square meter), the RMSE of the IDW method was appropriate for
this study.

In the next step, the LiDAR intensity data as well as the
merged data (intensity data and DNTM layer) were classified by
SVM. Each pixel of the road data extracted in the previous stage
was classified as true positive (TP), false negative (FN) or false
positive (FP). The results are shown in Table 1.


Table 1: Support Vector Machine (SVM) classification matrix (number of pixels)

                      TP       FP       FN
Intensity data     28403    38114    19758
Merged data        36145    21209    12003

The road edges were well-defined in the LiDAR-extracted layers
and enabled an accurate digitizing of the centerline location.
More than 95% of the LiDAR-derived road was digitized within
1.3 m normal to the field-surveyed centerline. The remaining
five percent of the road length was located further than 1.3 m
from the surveyed centerline, though the maximum separation
between the field-surveyed and digitized centerlines did not
exceed 1.7 m. Part of the surveyed and digitized road centerlines is
shown in Fig. 2.



Fig. 2: Field-surveyed centerline buffer (green) and digitized centerline
buffer (pink)

Completeness, correctness, and quality values calculated for
the extracted road are listed in Table 2. The SVM-based
classification approach was easy to apply to road extraction in
this experiment, irrespective of the accuracy achieved.


Table 2. Assessments of SVM classification

                  Correctness   Completeness   Quality
Intensity data       42.70%        58.97%       32.92%
Merged data          63.02%        75.07%       52.11%


Road position is an important inventory parameter for forest
management assessment. Road position can be obtained accurately
and efficiently using high-resolution LiDAR data, which
reduces the need for field-based surveys for this basic parameter.
Greater opportunities now exist for broad-scale analyses which 
incorporate thorough and accurate measurements of forest road 
systems. LiDAR-derived road data can address gaps that exist in 
current data sources, especially for forested areas, and represent 
a valuable tool to assess forest roads at scales not previously 
feasible (White et al. 2010). 

Conclusions 

Forest road maps can serve a variety of purposes for forest
managers. Producing updated road maps is among the most valuable
and easily attained products of LiDAR data analysis. In this
study, forest roads were identified and some roads extracted,
particularly those that did not have a pronounced topographic
cross-section or were difficult to identify using LiDAR surface
grids, such as those used by White et al. (2010). In general,
despite the large number of automated software designs, road
extraction typically remains a manual process in the field of
forestry (Doucette et al. 2009).

In this study, a simple and relatively efficient method was
provided for extraction of forest road features, although we
identified some disadvantages. If automated road extraction were
conducted for mapping larger areas, these techniques might yield
substantial time savings, even allowing for the additional training
needed to improve efficiency (White et al. 2010; Doucette et al. 2009).
With LiDAR-extracted layers, forest managers can identify terrain
conditions before entering the field, aiding the initial planning
and wood transport layout.

Access to accurate maps of forest roads can provide forest 
managers with good agreement between initial plans and actual 
field conditions, e.g., forest road maps can be used for harvest 
planning operations or for wildfire preparedness. 

The ability to extract road maps in forested areas can be
substantially improved by the proposed method as compared to use
of traditional data sources. In addition to describing road features,
road characteristics measured using LiDAR methods were highly
accurate.

The ±1.3 m positional accuracy for road features is a substantial
improvement compared to the accuracy (±10 m) of traditional
data sources used to plot roads on the 1:25,000 topographic
maps in Iran. A similar level of positional accuracy of
one to two meters for an extracted road centerline was reported
by Rieger et al. (1999) and White et al. (2010). The level of
positional accuracy needed for GIS datasets varies substantially with
their intended use. For analysis conducted at the scale of this
study, a simple criterion for horizontal accuracy was that the
digitized centerline must lie within the width of the actual road
bed. The maximum distance between the field-survey centerline
and the digitized line was 1.3 m and was essentially one-half of
the width of the Shastkola road.